# INT8 Quantization
## Bytedance BAGEL 7B MoT INT8
**Author:** Gapeleon · **License:** Apache-2.0 · **Task:** Text-to-Image · **Downloads:** 190 · **Likes:** 20

BAGEL is an open-source multimodal foundation model with 7B active parameters that supports both multimodal understanding and generation tasks.
## Meta Llama 3.1 8B Instruct Quantized.w8a8
**Author:** RedHatAI · **Task:** Large Language Model · **Tags:** Transformers, Multilingual · **Downloads:** 9,087 · **Likes:** 16

INT8-quantized version of Meta-Llama-3.1-8B-Instruct with both weights and activations quantized (w8a8), suitable for multilingual business and research applications.
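Checkpoints with a `.w8a8` suffix quantize both weights and activations to INT8 and are typically served with an INT8-aware engine such as vLLM. Below is a minimal serving sketch; the repo id is inferred from the listing and the generation settings are illustrative assumptions.

```python
# Minimal sketch: serving a w8a8 (INT8 weight + activation) checkpoint with vLLM.
# The repo id below is inferred from the listing; verify the exact name on the hub.
from vllm import LLM, SamplingParams

llm = LLM(model="RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8")

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain w8a8 INT8 quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```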
## QwQ 32B INT8 W8A8
**Author:** ospatch · **License:** Apache-2.0 · **Task:** Large Language Model · **Tags:** Transformers, English · **Downloads:** 590 · **Likes:** 4

INT8-quantized version of QwQ-32B, optimized by reducing the bit-width of both weights and activations.
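For context on how W8A8 checkpoints like this are commonly produced: a one-shot SmoothQuant + GPTQ recipe with llm-compressor is a typical route. The sketch below is illustrative, not this checkpoint's actual recipe; the module paths, calibration dataset, and source repo id are assumptions to verify against your llm-compressor version.

```python
# Illustrative sketch: one-shot W8A8 quantization with llm-compressor.
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.transformers import oneshot  # newer versions: from llmcompressor import oneshot

recipe = [
    # Migrate activation outliers into weights so activations quantize cleanly.
    SmoothQuantModifier(smoothing_strength=0.8),
    # INT8 weights and activations for Linear layers; keep lm_head in high precision.
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

oneshot(
    model="Qwen/QwQ-32B",             # assumed source repo id
    dataset="open_platypus",          # small calibration set (assumption)
    recipe=recipe,
    output_dir="QwQ-32B-INT8-W8A8",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```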
## Qwen2.5 VL 7B Instruct Quantized.w8a8
**Author:** RedHatAI · **License:** Apache-2.0 · **Task:** Image-to-Text · **Tags:** Transformers, English · **Downloads:** 1,992 · **Likes:** 3

Quantized version of Qwen2.5-VL-7B-Instruct that accepts vision-text input and produces text output, optimized for inference efficiency through INT8 quantization of weights and activations (w8a8).
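Vision-language w8a8 checkpoints can be served the same way; with vLLM the image is passed alongside a chat-formatted prompt. A rough sketch, where the repo id and the Qwen-style prompt template are assumptions to check against the model card:

```python
# Rough sketch: image + text inference for a quantized Qwen2.5-VL model via vLLM.
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(model="RedHatAI/Qwen2.5-VL-7B-Instruct-quantized.w8a8")  # assumed repo id

# Qwen-style chat template with an image placeholder (verify against the model card).
prompt = (
    "<|im_start|>user\n"
    "<|vision_start|><|image_pad|><|vision_end|>Describe this image.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": Image.open("example.jpg")}},
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```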
## DeepSeek R1 Distill Qwen 14B Quantized.w8a8
**Author:** neuralmagic · **License:** MIT · **Task:** Large Language Model · **Tags:** Transformers · **Downloads:** 765 · **Likes:** 2

Quantized version of DeepSeek-R1-Distill-Qwen-14B with INT8 weights and activations, reducing GPU memory requirements and improving computational efficiency.
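The memory claim is easy to sanity-check: 14B parameters at one byte each occupy roughly 14 GB of weights in INT8, versus roughly 28 GB at FP16/BF16 (two bytes per parameter), before accounting for activations and KV cache.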
## FLUX.1 Dev Qint8
**Author:** Disty0 · **License:** Other · **Task:** Text-to-Image · **Tags:** English · **Downloads:** 2,617 · **Likes:** 12

FLUX.1-dev is a text-to-image diffusion model quantized to INT8 format using Optimum Quanto, suitable for non-commercial use.
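Quantizing a FLUX pipeline with Optimum Quanto follows a quantize-then-freeze pattern. A minimal sketch, assuming the standard diffusers `FluxPipeline` and the upstream `black-forest-labs/FLUX.1-dev` weights rather than this exact repo:

```python
# Minimal sketch: INT8 weight quantization of the FLUX transformer with Optimum Quanto.
import torch
from diffusers import FluxPipeline
from optimum.quanto import freeze, qint8, quantize

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

# Quantize the largest component (the transformer) to INT8, then freeze it
# to replace float weights with their integer representation.
quantize(pipe.transformer, weights=qint8)
freeze(pipe.transformer)

pipe.to("cuda")
image = pipe("a watercolor fox in falling snow", num_inference_steps=28).images[0]
image.save("fox.png")
```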
## BGE Large En V1.5 Quant
**Author:** RedHatAI · **License:** MIT · **Task:** Text Embedding · **Tags:** Transformers, English · **Downloads:** 1,094 · **Likes:** 22

Quantized (INT8) ONNX variant of BGE-large-en-v1.5 with inference acceleration via DeepSparse.
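DeepSparse ships a sentence-transformers-style wrapper for CPU inference on ONNX embedders like this one. A sketch, assuming the wrapper available in recent DeepSparse releases and an inferred repo id:

```python
# Sketch: CPU embedding inference with DeepSparse's sentence-transformers wrapper.
from deepsparse.sentence_transformers import SentenceTransformer

# Repo id inferred from the listing; verify the exact name on the hub.
model = SentenceTransformer("RedHatAI/bge-large-en-v1.5-quant", export=False)

embeddings = model.encode(["INT8 quantization trades a little accuracy for speed."])
print(embeddings.shape)  # bge-large produces 1024-dimensional vectors
```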
## RoBERTa Base Go Emotions ONNX
**Author:** SamLowe · **License:** MIT · **Task:** Text Classification · **Tags:** Transformers, English · **Downloads:** 41.50k · **Likes:** 20

ONNX version of the RoBERTa-base-go_emotions model, available in both full-precision and INT8-quantized variants for multi-label emotion classification.
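The ONNX repo can be consumed through Optimum's ONNX Runtime backend. In the sketch below, the path of the quantized graph inside the repo is an assumption to check against the repo's file list:

```python
# Sketch: multi-label emotion scoring with the INT8 ONNX variant via Optimum.
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

repo = "SamLowe/roberta-base-go_emotions-onnx"
model = ORTModelForSequenceClassification.from_pretrained(
    repo, file_name="onnx/model_quantized.onnx"  # assumed path to the INT8 graph
)
tokenizer = AutoTokenizer.from_pretrained(repo)

# top_k=None returns scores for all 28 GoEmotions labels (multi-label output).
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=None)
print(classifier("Thanks, this genuinely made my day!")[0][:3])
```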
## DistilBERT Base Uncased Distilled SQuAD INT8 Static INC
**Author:** Intel · **License:** Apache-2.0 · **Task:** Question Answering · **Tags:** Transformers · **Downloads:** 1,737 · **Likes:** 4

INT8-quantized version of the DistilBERT base uncased model for question answering, with model size and inference speed optimized through post-training static quantization (Intel Neural Compressor).
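Checkpoints produced with Intel Neural Compressor load through optimum-intel's INC model classes. A sketch consistent with that integration; the class name and repo id below are assumptions to verify against your optimum-intel version:

```python
# Sketch: loading an Intel Neural Compressor INT8 checkpoint via optimum-intel.
from optimum.intel import INCModelForQuestionAnswering
from transformers import AutoTokenizer, pipeline

repo = "Intel/distilbert-base-uncased-distilled-squad-int8-static-inc"  # assumed id
model = INCModelForQuestionAnswering.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo)

qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
print(qa(
    question="What does static quantization fix ahead of time?",
    context="Post-training static quantization calibrates and fixes activation "
            "scales before inference, unlike dynamic quantization.",
))
```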
## BERT Large Uncased Whole Word Masking SQuAD INT8 0001
**Author:** dkurt · **Task:** Question Answering · **Tags:** Transformers · **Downloads:** 23 · **Likes:** 0

BERT-large English Q&A model pre-trained with whole-word masking, fine-tuned on SQuAD v1.1, and quantized to INT8 precision.
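The `0001` suffix follows OpenVINO Open Model Zoo naming, so the model ships as an IR pair (an .xml graph plus .bin weights) and runs through the OpenVINO runtime. A sketch with assumed file and input names; inspect `model.inputs` if they differ:

```python
# Sketch: CPU inference for an OpenVINO IR (.xml/.bin) INT8 SQuAD model.
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("bert-large-uncased-whole-word-masking-squad-int8-0001.xml")
compiled = core.compile_model(model, "CPU")

# Inspect the graph's actual input names/shapes before building real inputs.
for inp in model.inputs:
    print(inp.any_name, inp.shape)

# Dummy SQuAD-style inputs (the names and 384 sequence length are assumptions).
feed = {
    "input_ids": np.zeros((1, 384), dtype=np.int64),
    "attention_mask": np.ones((1, 384), dtype=np.int64),
    "token_type_ids": np.zeros((1, 384), dtype=np.int64),
}
result = compiled(feed)  # start/end logits for answer-span extraction
```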